GRAPH EMBEDDING TECHNIQUES FOR BOUNDING CONDITION NUMBERS OF
INCOMPLETE FACTOR PRECONDITIONERS 

STEPHEN GUATTERY* 


Abstract. 

We extend graph embedding techniques for bounding the spectral condition number of preconditioned
systems involving symmetric, irreducibly diagonally dominant M-matrices to systems where the
preconditioner is not diagonally dominant. In particular, this allows us to bound the spectral condition
number when the preconditioner is based on an incomplete factorization. We provide a review of previous
techniques, describe our extension, and give examples both of a bound for a model problem and of ways in
which our techniques give an intuitive way of looking at incomplete factor preconditioners.

Key words. incomplete Cholesky factorization, graph eigenvalues and eigenvectors, preconditioning

Subject classification. Computer Science 

1. Introduction. The number of iterations required for convergence is an important measure of the 
performance of iterative methods such as conjugate gradient and preconditioned conjugate gradient. In most 
cases, this measure is difficult to determine; however, upper bounds based on the spectral condition number 
can be calculated. 

The spectral condition number is the ratio of the largest to the smallest eigenvalue of the matrix for
which we are computing solutions. In preconditioned systems with matrix $A$ and preconditioner $B$, this
ratio is computed for the matrix $B^{-1}A$. Calculating spectral condition numbers exactly can require
substantial work and storage. Hence it is desirable to find a general method for bounding them that can
be applied to a wide range of matrices and preconditioners.

Gremban [11] used graph embeddings in bounding condition numbers for preconditioned systems where 
the matrix and preconditioner are symmetric, irreducibly diagonally dominant M-matrices (such matrices 
have positive diagonal and nonpositive off-diagonal entries). He also gave an extension that allows positive 
off-diagonal entries as long as the matrix remains diagonally dominant. His techniques are often easy to apply 
and often give good bounds. They can take advantage of the wide variety of embedding results developed 
in the study of networks. However, they apply only to a restricted set of matrices. 

In this paper, we show how embedding techniques can be extended to handle a class of positive defi- 
nite preconditioners that may not be diagonally dominant. In particular, this allows us to apply them to 
preconditioners formed by incomplete Cholesky factorization techniques. In addition to allowing broader 
application of the embedding techniques, our extension also provides some nice intuitive interpretations of 
incomplete factor preconditioners. We give an example at the end of this paper. 

The rest of this paper is organized as follows: Section 2 covers previous work that relates graph em- 
beddings and eigenvalue bounds; Section 3 presents the notation and terminology we use; Section 4 reviews 
embedding techniques for generalized Laplacians; Section 5 presents our techniques for extending these 
techniques to preconditioners based on incomplete factorizations; and Section 6 presents some examples to
illustrate how these extensions can be used. 

2. Previous Work. The study of the connection between Laplacian spectra (particularly with respect
to $\lambda_2$, the smallest nontrivial eigenvalue of a Laplacian matrix) and properties of the associated
graphs dates back to Fiedler's work in the 1970s (see, e.g., [7] and [8]).

The relationship between graph embeddings and matrix representations has been the subject of much 
interesting research. A large proportion of this work has been aimed at bounding the second largest
eigenvalues of time-reversible Markov chains in order to bound the mixing time for random walks. The use of
clique embeddings to bound eigenvalues arose in the analysis of mixing times for Markov chains by Jerrum and
Sinclair [14] [17]. Further work in this direction was done by Diaconis and Stroock [4] and by Sinclair [16].
Kahale [15] generalized this work in terms of methods that assign lengths to the graph edges, and showed
that the best bound over all edge length assignments is the largest eigenvalue of the matrix
$\Gamma^T \Gamma$, where $\Gamma$ is a matrix representing the path embedding ([15] also cites unpublished
work by Fill and Sokal in these directions). He also gave a semidefinite programming formulation for a model
allowing fractional paths, and showed that the bound is off by at most a factor of $\log^2 n$. He showed
this gap is tight; he also noted that the results can be applied to bounding $\lambda_2$ of a Laplacian
from below.

Guattery, Leighton, and Miller [12] presented a lower bound technique for $\lambda_2$ of a Laplacian. It
assigns priorities to paths in the embedding, and uses these to compute congestions of edges in the original
graph with respect to the embedding. Summing the congestions along the edges in a path gives the path 
congestion; the lower bound is a function of the reciprocal of the maximum path congestion taken over all 
paths. For the clique case, they showed that this method is the dual of the method presented in [15]; the 
best lower bounds produced by these methods are the same. They also showed how to apply their method 
in the Dirichlet boundary case by using star embeddings. In the clique case, they show that using uniform 
priorities for any tree T gives a lower bound that is within a factor proportional to the logarithm of the 
diameter of the tree. 

Guattery and Miller [13] have shown that the bounds discussed in the previous paragraphs are not tight 
because of the problem representation. By incorporating edge directions into the embeddings, they were able 
to show that there exists an embedding (dubbed the current flow embedding) for which there is an exact 
relationship between the largest eigenvalue of $\Gamma^T \Gamma$ and the smallest nontrivial Laplacian eigenvalue. This is
true both for clique and star embedding cases. 

Gremban [11] has shown how to use embeddings to generate support numbers, which also provide 
bounds on the largest and smallest generalized eigenvalues (and hence the spectral condition number) of 
preconditioned linear systems involving a generalized definition of Laplacians. This work is reviewed in 
Section 4 below. He also defined the support tree preconditioner, and used the support number bounds to 
prove properties about the quality of these preconditioners. Gremban, Miller, and Zagha have evaluated the 
performance of these techniques [10]. 

3. Notation and Terminology. 

3.1. Matrices. All matrices considered in this paper are real matrices. We use capital letters (e.g. A) 
to represent matrices and bold lower case letters to represent vectors (e.g., x). 

A matrix $A$ is diagonally dominant if all diagonal entries are positive ($a_{ii} > 0$), and for every
row $i$, $a_{ii} \ge \sum_{j \ne i} |a_{ij}|$. If the inequality in the second condition is strict for all
rows, then the matrix is strictly diagonally dominant. If $A$ is irreducible and the second condition is
strict for at least one row $i$, then the matrix is irreducibly diagonally dominant.

3.2. Generalized Laplacian Matrices and Graphs. Definition 3.1. A is a generalized Laplacian 
matrix if and only if: 

• A is symmetric; 

• all diagonal entries $a_{ii} > 0$;

• A is diagonally dominant. 

To ensure positive definiteness, we will assume that the matrix is irreducibly diagonally dominant. Note
that if the matrix is positive definite and reducible, we can break the problem into smaller pieces of the 
desired form. 

Generalized Laplacians correspond to graphs with positive edge weights according to the following rules: 

• Each row (or column) corresponds to a vertex. 

• Nonzero off-diagonal entries correspond to edges. That is, for $i \ne j$, if $a_{ij} \ne 0$, then there
is an edge between vertices $v_i$ and $v_j$ with weight $-a_{ij}$.

• The diagonal entry $a_{ii}$ is the sum of the weights of the edges incident to vertex $v_i$. If a diagonal
entry is greater than the sum of the incident edge weights, there is an additional edge from that
vertex to an implicit zero-valued boundary vertex. While this vertex is implicit with respect to the
matrix, we will represent it explicitly with respect to the graph.

When necessary, we use the following notation to relate graphs and matrices: For a Laplacian A, the 
associated graph is G(A). The generalized Laplacian of a graph G is denoted L(G). 

The following property is a useful consequence of interpreting a Laplacian as a graph: Let $A$ be a
Laplacian with associated graph $G$. Recall that $-a_{ij}$ is the weight of edge $(v_i, v_j)$. For all $x$,

(3.1)    $x^T A x = \sum_{(v_i, v_j) \in E(G)} -a_{ij} (x_i - x_j)^2.$

(Edges represented by surpluses on the diagonal are included in this sum by using 0 for the value at the 
(implicit) boundary vertex.) 
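As a minimal numerical sketch of identity (3.1), the following NumPy snippet compares the quadratic form with the edge sum, treating each diagonal surplus as an edge to a boundary vertex of value 0. The matrix, the vector, and the helper name `edge_sum` are hypothetical illustrations and do not come from the paper.

```python
import numpy as np

def edge_sum(A, x):
    """Right-hand side of (3.1): sum of weight * (x_i - x_j)^2 over all edges."""
    n = A.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += -A[i, j] * (x[i] - x[j]) ** 2   # edge weight is -a_ij
        # A diagonal surplus (the row sum, since off-diagonals are nonpositive)
        # is an edge to the implicit boundary vertex, whose value is fixed at 0.
        surplus = A[i].sum()
        total += surplus * (x[i] - 0.0) ** 2
    return total

# Hypothetical example: a 3-vertex path whose first vertex has a
# boundary edge of weight 2.
A = np.array([[ 3.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])
x = np.array([0.5, -1.0, 2.0])

quadratic_form = float(x @ A @ x)
print(quadratic_form, edge_sum(A, x))
```

Both printed values agree, which is the content of (3.1) for this small case.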

3.3. Graph Embeddings. For a graph G , we will use the notation V (G) to represent the set of vertices 
of G, and E(G) to represent the set of edges of G. 

An embedding of $H$ into $G$ is a collection $\Gamma$ of path subgraphs of $G$ such that for each edge
$(v_i, v_j) \in E(H)$, the embedding contains a simple path $\gamma_{ij}$ from $v_i$ to $v_j$ in $G$. For
full generality, we will allow fractional paths in our embeddings: i.e., an edge $(v_i, v_j) \in E(H)$ can
be associated with a finite collection of simple paths from $v_i$ to $v_j$ in $G$; each such path has a
positive fractional factor associated with it such that these factors add up to 1. If a path $\gamma$
includes edge $e$, we say that $\gamma$ is incident to $e$. The weight $w_\gamma$ of a path $\gamma$ is the
weight of the corresponding edge in $H$. In the case of fractional paths, the weight is scaled by the
corresponding fractional factor.

The congestion $c_e$ of edge $e \in E(G)$ is the sum of the weights of the paths incident to $e$. (In the
unweighted case, this is just the number of paths that include $e$.) The congestion of the embedding is the
maximum edge congestion taken over all edges in $G$.

The dilation of an edge $f$ in $H$ is the length (the number of edges) of $f$'s path $\gamma_f$ in the
embedding. The dilation of the embedding is the maximum dilation taken over all edges in $H$.
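The definitions of congestion and dilation can be sketched in a few lines of Python. The example below embeds a unit-weight 4-cycle $H$ into the path $0$-$1$-$2$-$3$; the dictionary layout and the function name `congestion_and_dilation` are assumptions made for illustration only.

```python
from collections import defaultdict

def congestion_and_dilation(embedding):
    """embedding maps each edge of H to (weight, vertex list of its path in G)."""
    edge_congestion = defaultdict(float)
    dilation = 0
    for weight, path in embedding.values():
        hops = list(zip(path[:-1], path[1:]))      # edges of G used by this path
        dilation = max(dilation, len(hops))
        for u, v in hops:
            edge_congestion[frozenset((u, v))] += weight
    congestion = max(edge_congestion.values())
    return dict(edge_congestion), congestion, dilation

# Hypothetical embedding: three cycle edges map to themselves, and the
# closing edge (0, 3) is routed along the whole path.
embedding = {
    (0, 1): (1.0, [0, 1]),
    (1, 2): (1.0, [1, 2]),
    (2, 3): (1.0, [2, 3]),
    (0, 3): (1.0, [0, 1, 2, 3]),
}
per_edge, congestion, dilation = congestion_and_dilation(embedding)
print(congestion, dilation)
```

Every path edge carries its own cycle edge plus the rerouted edge, so the congestion is 2, and the rerouted edge gives dilation 3.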

We note that when we represent matrices by graphs for embedding purposes, the graph representations 
(and hence the embeddings) include the (implicit) edges to the boundary indicated by surpluses on the 
matrix diagonals. In such cases, the boundary is represented by a single vertex; the weights of the edges to
the boundary are the diagonal surpluses. 

3.4. Incomplete Factorizations. Incomplete factorization techniques provide one way of constructing 
preconditioners for sparse systems (see the survey by Chan and van der Vorst [3] for a good overview 
of incomplete factorization techniques). Since we are dealing with symmetric matrices, we will focus on 
incomplete Cholesky factorizations. 

To keep work and storage small, incomplete factorization methods limit the amount of fill allowed in the 
factor. One way to do this is to specify the entries where fill is allowed. There are many ways to do this, 
e.g., by level of the fill produced. In general, we can specify a set of allowed fill positions. We represent this
set by a 0-1 matrix $S$. By assumption, $S$ includes all nonzero entries in $A$.

For a symmetric matrix $A$, the incomplete Cholesky factorization algorithm produces a lower triangular
factor $L$. For $A$ a symmetric irreducibly diagonally dominant M-matrix, $L$ has a positive diagonal and
nonpositive off-diagonals. The preconditioner $B = LL^T$ disagrees with $A$ only at level (0) fill positions of $S$.

The error matrix $R = B - A$ is symmetric with a zero diagonal. When $A$ is a nonsingular generalized
Laplacian, it is also an M-matrix, and the off-diagonal entries of $R$ are nonnegative. Since we assume the
factorization is incomplete, there must be positive entries in $R$. It is easy to see that $R$ is indefinite:
it is nonzero and symmetric with zero trace. Let vector $d$ be the product of $R$ and the vector with all
entries 1, and let $D$ be the diagonal matrix with the entries of $d$ on the diagonal. We can write $R$ as
the sum of a positive semidefinite matrix $R^+ = (R + D)/2$ and a negative semidefinite matrix
$R^- = (R - D)/2$.
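The construction above can be sketched numerically. The code below is a minimal dense implementation of incomplete Cholesky in which fill is allowed only at the nonzero positions of $A$, applied to a small grid Laplacian; it then forms $R$, $R^+$, and $R^-$. The grid size, the standard 5-point stencil, and the helper names `grid_laplacian` and `ic0` are assumptions chosen for illustration, not the author's code.

```python
import numpy as np

def grid_laplacian(n):
    """5-point Laplacian of an n-by-n grid in natural order (diagonal 4)."""
    T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return np.kron(np.eye(n), T) + np.kron(T, np.eye(n))

def ic0(A):
    """Dense sketch of incomplete Cholesky restricted to A's nonzero pattern."""
    N = A.shape[0]
    L = np.tril(A).astype(float)
    allowed = L != 0.0
    for k in range(N):
        L[k, k] = np.sqrt(L[k, k])
        for i in range(k + 1, N):
            if allowed[i, k]:
                L[i, k] /= L[k, k]
        for j in range(k + 1, N):
            if allowed[j, k]:
                for i in range(j, N):
                    # Updates outside the allowed pattern are dropped.
                    if allowed[i, k] and allowed[i, j]:
                        L[i, j] -= L[i, k] * L[j, k]
    return L

A = grid_laplacian(4)
L = ic0(A)
B = L @ L.T
R = B - A                                  # error matrix

d = R @ np.ones(R.shape[0])                # row sums of R
D = np.diag(d)
R_plus = (R + D) / 2.0                     # positive semidefinite part
R_minus = (R - D) / 2.0                    # negative semidefinite part

print(R.max(), np.linalg.eigvalsh(R_plus).min(), np.linalg.eigvalsh(R_minus).max())
```

For this example the diagonal of $R$ vanishes, its off-diagonal entries are nonnegative, and the two parts have the stated definiteness up to rounding.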

4. Graph Embedding Techniques for Generalized Laplacians. 

4.1. Preconditioned Conjugate Gradient and the Spectral Condition Number. Consider the
problem of solving $Ax = b$ for $x$, where $A$ is a positive definite generalized Laplacian. This can be done
using the conjugate gradient method. The number of iterations required for convergence depends on the
distribution of $A$'s eigenvalues, and can be complicated to compute (see e.g. [2]). However, an upper bound
on the convergence rate can be computed from the spectral condition number
$\kappa(A) = \lambda_{\max}(A) / \lambda_{\min}(A)$. The rate of convergence can often be improved by
applying a preconditioner $B$ and using the preconditioned conjugate gradient algorithm. In this case, the
rate of convergence depends on the spectrum of $B^{-1}A$ and can be bounded above as a function of
$\kappa(B^{-1}A) = \lambda_{\max}(B^{-1}A) / \lambda_{\min}(B^{-1}A)$. We assume that the preconditioner $B$
is symmetric positive definite in the discussion below.

The problem of computing $\kappa(B^{-1}A)$ can be defined in terms of the generalized eigenvalue problem,
which involves finding all scalars $\lambda$ for which there exists an $x \ne 0$ such that
$Ax = \lambda Bx$. Since by assumption $B$ is nonsingular, we have

(4.1)    $Ax = \lambda Bx \iff B^{-1}Ax = \lambda x.$

The ordered pair $(A, B)$ is called a matrix pencil, and we denote an eigenvalue of the pencil by
$\lambda(A, B)$.

In the next section we present a lemma that provides the basis for computing upper bounds on
$\lambda_{\max}(A, B)$, and hence on $\lambda_{\max}(B^{-1}A)$. We note that since $A$ and $B$ are by
assumption positive definite,

(4.2)    $\frac{1}{\lambda_{\min}(B^{-1}A)} = \lambda_{\max}((B^{-1}A)^{-1}) = \lambda_{\max}(A^{-1}B).$

Thus, we can apply upper bounds to both terms in the definition of $\kappa(B^{-1}A)$.
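A small numerical sketch may help make (4.1) and (4.2) concrete. It computes the eigenvalues of the pencil $(A, B)$ through a Cholesky factor of $B$, so that only a symmetric eigenproblem is solved. The test matrices (a path Laplacian with a diagonal preconditioner) and the name `pencil_eigenvalues` are assumptions for illustration.

```python
import numpy as np

def pencil_eigenvalues(A, B):
    """Eigenvalues of the pencil (A, B) for symmetric A and SPD B."""
    C = np.linalg.cholesky(B)                       # B = C C^T
    # M = C^{-1} A C^{-T} is symmetric and similar to B^{-1} A.
    M = np.linalg.solve(C, np.linalg.solve(C, A).T)
    return np.linalg.eigvalsh((M + M.T) / 2.0)

n = 6
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # path with boundary edges
B = np.diag(np.diag(A))                                  # hypothetical diagonal preconditioner

lam = pencil_eigenvalues(A, B)        # eigenvalues of B^{-1} A
lam_rev = pencil_eigenvalues(B, A)    # eigenvalues of A^{-1} B

kappa = lam.max() / lam.min()
print(kappa, lam.max() * lam_rev.max())
```

The two printed numbers coincide, reflecting that $\kappa(B^{-1}A)$ is the product of the two largest eigenvalues in (4.2).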
4.2. The Support Lemma.

Let $A$ be a symmetric matrix, $B$ be a symmetric positive definite matrix, and let $\tau$ be a real number.
Axelsson [1] gives the following lemma:

Lemma 4.1. If $\tau B - A$ is positive semidefinite, then $\lambda_{\max}(B^{-1}A) \le \tau$.

Proof. Let $\lambda = \lambda_{\max}(B^{-1}A)$ and let $u$ be a corresponding eigenvector. By (4.1),
$Au = \lambda Bu$. Starting from the assumption that $\tau B - A$ is positive semidefinite, we can deduce
the following:

    $0 \le u^T(\tau B - A)u = (\tau - \lambda)\, u^T B u.$

Since $B$ is positive definite, this implies that $\tau - \lambda \ge 0$. □

Gremban [11] refers to this as the Support Lemma. He defines support as follows: The support
$\sigma(A, B)$ of matrix $B$ for matrix $A$ is

    $\sigma(A, B) = \min\{\tau : \tau B - A \text{ is positive semidefinite}\}.$

He uses support to find upper bounds on the spectral condition number of preconditioned systems using
generalized Laplacians. Note that, by the Support Lemma, $\lambda_{\max}(B^{-1}A) \le \sigma(A, B)$, and
$1/\lambda_{\min}(B^{-1}A) = \lambda_{\max}(A^{-1}B) \le \sigma(B, A)$. Thus

    $\kappa(B^{-1}A) = \frac{\lambda_{\max}(B^{-1}A)}{\lambda_{\min}(B^{-1}A)} \le \sigma(A, B) \cdot \sigma(B, A).$
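To illustrate the definition of support, the sketch below approximates $\sigma(A, B)$ directly from its definition, by bisection on $\tau$ with a positive semidefiniteness test, and compares the result with $\lambda_{\max}(B^{-1}A)$. The bisection routine, its tolerances, and the example matrices are assumptions for illustration; this is not Gremban's procedure.

```python
import numpy as np

def is_psd(M, tol=1e-12):
    return np.linalg.eigvalsh((M + M.T) / 2.0).min() >= -tol

def support(A, B, iters=80):
    """Approximate sigma(A, B) = min{tau : tau*B - A is PSD} by bisection.

    Assumes A and B are symmetric positive definite, so the answer is positive.
    """
    lo, hi = 0.0, 1.0
    while not is_psd(hi * B - A):      # grow hi until tau = hi is feasible
        lo, hi = hi, 2.0 * hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if is_psd(mid * B - A):
            hi = mid
        else:
            lo = mid
    return hi

n = 5
B = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # path with boundary edges
c = np.zeros(n)
c[0], c[-1] = 1.0, -1.0
A = B + np.outer(c, c)                                   # same path plus one chord

sigma_AB = support(A, B)
sigma_BA = support(B, A)
lam = np.linalg.eigvals(np.linalg.solve(B, A)).real
kappa = lam.max() / lam.min()
print(sigma_AB, lam.max(), kappa, sigma_AB * sigma_BA)
```

In this symmetric positive definite setting the support coincides with the largest pencil eigenvalue, so the product of the two supports reproduces $\kappa(B^{-1}A)$ up to the bisection tolerance.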

Gremban's method works by decomposing $A$ into $k$ pieces $A_1, A_2, \ldots, A_k$ such that
$\sum_{i=1}^{k} A_i = A$.

Likewise, suppose $B$ can be decomposed into $k$ positive semidefinite pieces $B_1, B_2, \ldots, B_k$.
Assume that we have a set $\{\tau_1, \tau_2, \ldots, \tau_k\}$ such that $\tau_i B_i - A_i$ is positive
semidefinite for all $i$. Let $\tau^* = \max_i \tau_i$. Then $\tau^* B_i - A_i$ is positive semidefinite
for all $i$, and

    $\sum_{i=1}^{k} (\tau^* B_i - A_i) = \tau^* \sum_{i=1}^{k} B_i - \sum_{i=1}^{k} A_i = \tau^* B - A.$

By linearity, $\tau^* B - A$ is positive semidefinite.

For generalized Laplacians the decompositions of $A$ and $B$ can be based on graph embeddings. Let $k$
be the number of edges in $G(A)$. We will decompose $B - A$ into pieces $B_i - A_i$, where $A_i$ is the
Laplacian of the graph on $V(G(A))$ consisting of only edge $e_i \in E(G(A))$. $B_i$ is the Laplacian of the
(appropriately weighted) corresponding path in $G(B)$.

To determine the appropriate weighting of the paths, note that an edge $e$ in $G(B)$ may show up in
multiple paths in the decomposition. The edge must be divided up so that the weights of the pieces of $e$ on
various paths sum to $w_e$, the weight of $e$. Let $c_e$ be the congestion of edge $e$ in $B$; let $w_f$ be
the weight of edge $f$ in $A$. Assume that the path for $f$ includes $e$. The amount of weight from $w_e$
assigned to the path associated with $f$ is

    $\frac{w_f}{c_e}.$

(In the unweighted case, this will be just $\frac{1}{c_e}$.)

Note that if there are edges in G(B) that do not occur in any path, they can be separated out into a 
component that can support an empty component of A. Thus they do not affect the rest of the calculation. 

The decomposition given above reduces the problem of finding a $\tau$ that is an upper bound on
$\sigma(A, B)$ to the problem of computing the $\tau_i$ for a number of problems that consist of supporting
an edge with a path.
4.3. The Path Problem and Electrical Circuits. Now consider the problem of a path $\gamma$ in $G(B)$
supporting an edge $e$ in $G(A)$. Assume the path length (the dilation for edge $e$) is $j$, and let $w_e$
be the weight of $e$. For simplicity of notation, we reindex the vertices from 1 to $j + 1$ according to
their order along the path. Let $c_i$ be the congestion on edge $(i, i+1)$ of the path. The weight of edge
$i$ in the path is $w_e / c_i$. Using (3.1), we can state the path problem as follows: Choose $\tau_\gamma$
such that

    $\tau_\gamma \sum_{i=1}^{j} \frac{w_e}{c_i} (x_i - x_{i+1})^2 \ge w_e (x_1 - x_{j+1})^2,$

or, cancelling the common factor $w_e$,

    $\tau_\gamma \sum_{i=1}^{j} \frac{1}{c_i} (x_i - x_{i+1})^2 \ge (x_1 - x_{j+1})^2.$

The path problem looks like the power problem in a series resistive circuit (see footnote 1). In particular,
the entries in $x$ correspond to voltages at path nodes, and each value $1/c_i$ corresponds to the conductance
between nodes $i$ and $i + 1$. Since conductances are reciprocals of resistances, the congestion of an edge
can be thought of as its resistance. Thus, we can restate the path problem as follows: Given voltages at the
ends of the circuit, what voltages at the internal nodes produce the minimum power dissipation?

We construct a series resistive circuit corresponding to path $\gamma$ as follows: For edge $i$ on the
path, assign a resistor with resistance $r_i = c_i$. Define the path resistance $r_\gamma$ as the sum of
the resistances on the path:

    $r_\gamma = \sum_{i=1}^{j} r_i = \sum_{i=1}^{j} c_i.$

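The following theorem is well-known (see e.g. [5]):

Theorem 4.2. For any $x$,

    $r_\gamma \sum_{i=1}^{j} \frac{1}{c_i} (x_i - x_{i+1})^2 \ge (x_1 - x_{j+1})^2.$

Proof. We can rewrite the left-hand side of the inequality in terms of congestions as follows:

    $\left( \sum_{i=1}^{j} c_i \right) \left( \sum_{i=1}^{j} \frac{(x_i - x_{i+1})^2}{c_i} \right).$

Rewriting slightly and applying Cauchy-Schwarz gives the following:

    $\sum_{i=1}^{j} (\sqrt{c_i})^2 \, \sum_{i=1}^{j} \left( \frac{x_i - x_{i+1}}{\sqrt{c_i}} \right)^2 \ge \left( \sum_{i=1}^{j} (x_i - x_{i+1}) \right)^2 = (x_1 - x_{j+1})^2.$

The last equality follows because the sum telescopes. This proves the theorem. □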

Thus, for any path problem, it is sufficient to set $\tau_\gamma = r_\gamma$, the sum of the congestions
along the path. The maximum path resistance taken over all paths in $\Gamma$ is denoted $r_{\max}$. The
corresponding path is called the critical path. By the partitioning argument in Section 4.2, setting
$\tau^* = r_{\max}$ ensures that $\tau^*$ is an upper bound on $\lambda_{\max}(B^{-1}A)$.
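The inequality of Theorem 4.2 can be checked numerically on a single path. In the sketch below the congestion values are hypothetical; random voltage vectors never violate the inequality, and voltage drops proportional to the resistances attain it with equality.

```python
import numpy as np

def path_resistance(c):
    """Sum of the congestions (resistances) along the path."""
    return float(np.sum(c))

def weighted_energy(c, x):
    """Sum over path edges of (1/c_i) * (x_i - x_{i+1})^2."""
    return float(np.sum(np.diff(x) ** 2 / c))

c = np.array([1.0, 3.0, 2.0, 5.0])      # hypothetical congestions, path length j = 4
r = path_resistance(c)

rng = np.random.default_rng(0)
slack = []
for _ in range(1000):
    x = rng.normal(size=len(c) + 1)
    slack.append(r * weighted_energy(c, x) - (x[0] - x[-1]) ** 2)
min_slack = min(slack)

# Voltage drops proportional to the resistances give equality.
x_tight = -np.concatenate(([0.0], np.cumsum(c)))
tight_gap = r * weighted_energy(c, x_tight) - (x_tight[0] - x_tight[-1]) ** 2
print(min_slack, tight_gap)
```

The equality case corresponds to the physical current flow through a series circuit, which is why $r_\gamma$ cannot be replaced by a smaller value for a single path.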

Gremban suggests an easier way of computing a sufficiently large r based on the following two facts: 


1 Laplacians can be used to represent resistive circuits, where the off-diagonal entries represent conductances between nodes 
in the circuit. 
• For any path $\gamma$ in the embedding, setting all congestions equal to the maximum congestion on the
path only increases $r_\gamma$, which thus remains an upper bound.

• If all congestions on the path have value $c$, then the path resistance of $\gamma$ (and hence
$\tau_\gamma$) is the product of the path length times $c$.

Thus, the path resistance of any path is at most the product of the path's length and the maximum
congestion on any path edge. This product is bounded above for any path by the product of the embedding's
congestion and its dilation.

Though the product of the congestion of the embedding and its dilation can be greater than the value of
$\tau$ derived by looking at all paths, in many interesting cases the difference is not more than a constant
factor. Since this product is often much easier to compute, it is a useful simplification.
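The whole procedure of Sections 4.2 and 4.3 can be sketched on a toy problem with unit weights: $G(B)$ is a path with one boundary edge, and $G(A)$ adds a single chord that must be supported by the entire path. The graph, the helper `laplacian`, and the embedding below are hypothetical choices for illustration.

```python
import numpy as np
from collections import Counter

def laplacian(n, edges):
    """Unit-weight generalized Laplacian; an edge (u, None) goes to the boundary."""
    M = np.zeros((n, n))
    for u, v in edges:
        M[u, u] += 1.0
        if v is not None:
            M[v, v] += 1.0
            M[u, v] -= 1.0
            M[v, u] -= 1.0
    return M

n = 6
path_edges = [(i, i + 1) for i in range(n - 1)]
boundary_edge = (0, None)
chord = (0, n - 1)

B = laplacian(n, path_edges + [boundary_edge])
A = laplacian(n, path_edges + [boundary_edge, chord])

# Embedding of G(A) into G(B): shared edges support themselves,
# and the chord is routed along the whole path.
embedding = {e: [e] for e in path_edges + [boundary_edge]}
embedding[chord] = list(path_edges)

congestion = Counter(e for path in embedding.values() for e in path)
path_resistances = {f: sum(congestion[e] for e in path)
                    for f, path in embedding.items()}
r_max = max(path_resistances.values())
product = max(congestion.values()) * max(len(p) for p in embedding.values())

lam_max = np.linalg.eigvals(np.linalg.solve(B, A)).real.max()
print(lam_max, r_max, product)
```

Here the exact value of $\lambda_{\max}(B^{-1}A)$ is 6, while both $r_{\max}$ and the congestion-dilation product equal 10, so the bound holds and is within a factor of two in this case.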

4.4. An Extension for Positive Off-Diagonal Entries. Gremban gives an extension that allows the 
embedding techniques to be used for diagonally dominant symmetric matrices that have positive off-diagonal 
entries. We refer to such matrices as extended Laplacians (Gremban calls them generalized Laplacians, but 
this conflicts with Fiedler’s definition of that term, which we use). 

Note that any extended Laplacian $A$ can be written as the sum of a diagonal matrix $D$ equal to $A$'s
diagonal, a matrix $O^-$ containing the negative off-diagonal entries, and a matrix $O^+$ containing the
positive off-diagonal entries. The expansion $A_{exp}$ of $A$ has the following block form:

    $A_{exp} = \begin{bmatrix} D + O^- & -O^+ \\ -O^+ & D + O^- \end{bmatrix}.$

The following lemma is a restatement of Lemma 7.3 from Gremban’s thesis [11]. Let A be an extended 
Laplacian and x any vector. 

Lemma 4.3.

    $Ax = b$ if and only if $A_{exp} \begin{bmatrix} x \\ -x \end{bmatrix} = \begin{bmatrix} b \\ -b \end{bmatrix}.$

The proof is straightforward and left to the reader. 
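A short numerical sketch of the expansion and of Lemma 4.3 follows. The matrix is a hypothetical diagonally dominant example with one positive off-diagonal pair, and the helper name `expand` is an assumption.

```python
import numpy as np

def expand(A):
    """Build A_exp = [[D + O^-, -O^+], [-O^+, D + O^-]] from an extended Laplacian A."""
    D = np.diag(np.diag(A))
    off = A - D
    O_minus = np.minimum(off, 0.0)     # negative off-diagonal entries
    O_plus = np.maximum(off, 0.0)      # positive off-diagonal entries
    top = np.hstack([D + O_minus, -O_plus])
    bottom = np.hstack([-O_plus, D + O_minus])
    return np.vstack([top, bottom])

# Hypothetical extended Laplacian: diagonally dominant, with a positive (0, 2) entry.
A = np.array([[ 3.0, -1.0,  1.0],
              [-1.0,  3.0, -1.0],
              [ 1.0, -1.0,  3.0]])
A_exp = expand(A)

x = np.array([1.0, -2.0, 0.5])
b = A @ x
lhs = A_exp @ np.concatenate([x, -x])
rhs = np.concatenate([b, -b])
print(np.allclose(lhs, rhs))
```

The expanded matrix has only nonpositive off-diagonal entries, so it is a generalized Laplacian to which the embedding techniques apply.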

It is easy to show that applying the Support Lemma to the expansions of extended Laplacians yields
upper bounds on the eigenvalues of the original matrices. Let $A$ and $B$ be extended Laplacians.

Lemma 4.4. If $\tau$ is a nonnegative number such that $\tau B_{exp} - A_{exp}$ is positive semidefinite,
then $\tau B - A$ is also positive semidefinite.

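Proof. Let $\tau$ be small enough that $\tau B - A$ is not positive semidefinite. Then there is a vector
$u$ such that $u^T(\tau B - A)u < 0$. Let $v = (\tau B - A)u$; clearly $u^T v < 0$. By Lemma 4.3,

    $\begin{bmatrix} u^T & -u^T \end{bmatrix} (\tau B_{exp} - A_{exp}) \begin{bmatrix} u \\ -u \end{bmatrix} = \begin{bmatrix} u^T & -u^T \end{bmatrix} \begin{bmatrix} v \\ -v \end{bmatrix} = 2u^T v < 0.$

Thus, $\tau B_{exp} - A_{exp}$ is not positive semidefinite either. □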

As a result, we can apply embedding techniques to the expansions of extended Laplacians to get upper 
bounds that apply to the extended Laplacians. 

5. Bounds on the Condition Number for Incomplete Factorizations. The techniques from 
the previous section do not accommodate matrices that are not diagonally dominant, and thus are often 
unsuitable for preconditioners based on incomplete factorizations. More specifically, let A be a positive 
definite generalized Laplacian, let $L$ be the lower triangular incomplete factor of $A$, and let $B = LL^T = A + R$
be the resulting preconditioner. The matrix R adds positive off-diagonal entries without changing the 
diagonal. Typically this means there are zero-sum rows in A (i.e., nodes not adjacent to the boundary) 
whose corresponding rows in B have deficiencies on the diagonal. 

Let $B = A + R$ be the preconditioner formed from the incomplete factorization of $A$. We want an upper
bound on the condition number

    $\kappa((A + R)^{-1}A) = \frac{\lambda_{\max}((A + R)^{-1}A)}{\lambda_{\min}((A + R)^{-1}A)}.$

Since both $A$ and $A + R$ are symmetric positive definite, we can rewrite this as

    $\kappa((A + R)^{-1}A) = \lambda_{\max}((A + R)^{-1}A) \cdot \lambda_{\max}(A^{-1}(A + R)).$

We first consider $\lambda_{\max}((A + R)^{-1}A)$. By Lemma 4.1, any $\tau$ such that $\tau(A + R) - A$ is
positive semidefinite is an upper bound. If $R$ were positive semidefinite, $\tau = 1$ would work. However,
$R$ is indefinite, and $A$ must support $R$ as well as itself. This suggests splitting $A$ into two parts:
if we can find a positive $\alpha < 1$ such that $\alpha A + R$ is positive semidefinite, we can rewrite the
expression in $\tau$ as follows:

    $\tau(A + R) - A = \tau(\alpha A + R) + \tau(1 - \alpha)A - A.$

If such an $\alpha$ exists, $\tau(\alpha A + R)$ will be positive semidefinite for any nonnegative $\tau$,
and it will suffice to find $\tau$ such that $\tau(1 - \alpha)A - A$ is positive semidefinite. Thus
$\tau \ge 1/(1 - \alpha)$ will be an upper bound on $\lambda_{\max}((A + R)^{-1}A)$. The following lemma
shows the existence of such an $\alpha$.

Lemma 5.1. For $A$ and $R$ as defined above, there exists a positive $\alpha < 1$ such that $\alpha A + R$
is positive semidefinite.

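Proof. It is easy to see that

    $x^T(A + R)x \ge \lambda_{\min}(A + R)\, x^T x \ge \frac{\lambda_{\min}(A + R)}{\lambda_{\max}(A)}\, x^T A x.$

Since $A + R$ is positive definite, $\lambda_{\min}(A + R) > 0$, and
$\lambda_{\min}(A + R) / \lambda_{\max}(A)$ is positive. It is the case that

    $\lambda_{\min}(A + R) \le \lambda_{\max}(A) + \lambda_{\min}(R) < \lambda_{\max}(A);$

the first inequality follows from a well-known result (see e.g. Corollary 8.1.3 on p. 411 of Golub and Van
Loan [9]), and the second from the fact that $\lambda_{\min}(R)$ is negative. By the first chain of
inequalities, $\alpha A + R$ is positive semidefinite for
$\alpha = 1 - \lambda_{\min}(A + R)/\lambda_{\max}(A)$. Thus there exists an $\alpha$ such that

    $0 < \alpha \le 1 - \frac{\lambda_{\min}(A + R)}{\lambda_{\max}(A)} < 1.$ □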
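Lemma 5.1 can be illustrated numerically. The sketch below uses a small hypothetical pair: $A$ is a path Laplacian with boundary edges, and $R$ is a synthetic symmetric matrix with zero diagonal and small positive entries (it is not produced by an incomplete factorization). It compares the $\alpha$ from the proof with the smallest admissible $\alpha$, which for this setting equals $\lambda_{\max}(-A^{-1}R)$.

```python
import numpy as np

n = 5
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# Synthetic error matrix (an assumption for illustration): zero diagonal,
# nonnegative entries, small enough that A + R stays positive definite.
R = np.zeros((n, n))
for i, j in [(0, 2), (1, 3), (2, 4)]:
    R[i, j] = R[j, i] = 0.1

lam_min_AR = np.linalg.eigvalsh(A + R).min()
lam_max_A = np.linalg.eigvalsh(A).max()
alpha_lemma = 1.0 - lam_min_AR / lam_max_A            # value used in the proof

# Smallest alpha with alpha*A + R positive semidefinite.
alpha_best = np.linalg.eigvals(np.linalg.solve(A, -R)).real.max()

lam_max_precond = np.linalg.eigvals(np.linalg.solve(A + R, A)).real.max()
print(alpha_best, alpha_lemma, lam_max_precond, 1.0 / (1.0 - alpha_best))
```

The $\alpha$ from the proof is valid but typically far from the smallest one; with the smallest $\alpha$, the bound $1/(1 - \alpha)$ matches $\lambda_{\max}((A + R)^{-1}A)$ in this example.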

It is often possible to calculate a reasonable $\alpha$ using embedding techniques. However, since $R$ is
indefinite, some manipulation is required. The expression $\alpha A + R$ can be rewritten as
$\alpha A + R^+ + R^-$. Since $R^+$ is positive semidefinite, it suffices to find an $\alpha$ such that
$\alpha A + R^-$ is positive semidefinite. This can be rewritten as $\alpha A - (-R^-)$; since $R^-$ is the
negative of a Laplacian, embedding techniques can now be applied.

To get an upper bound for $\lambda_{\max}(A^{-1}(A + R))$ we need to find a $\tau$ such that
$\tau A - (A + R)$ is positive semidefinite. We can again rewrite in terms of $R^+$ and $R^-$:

    $\tau A - (A + R) = (\tau - 1)A - R^- - R^+.$
Since $-R^-$ is positive semidefinite, it suffices to find $\tau$ such that $(\tau - 1)A - R^+$ is positive
semidefinite. This can be done using embedding techniques, though it requires using the expansions of the
Laplacians of $A$ and $R^+$ because $R^+$ has positive off-diagonal entries. Note that the expansion of $A$
will be two copies of $A$, and that the expansion of $R^+$ will have edges that connect those two copies. The
embedding of $R^+$ into $A$ has to use the edges between each copy of $A$ and the vertex representing the
boundary.

6. Applications. 

6.1. Natural Ordering for a Square Grid. To simplify the presentation in this section, we will use
the same notation for both a matrix and its graph. For example, A will refer both to the matrix A and to 
G(A). The type of object referred to will be clear from context. 

Consider the level(0) incomplete Cholesky factorization of a unit-weight square grid in natural order,
where every node in the outer perimeter of the grid is connected to a zero Dirichlet boundary. Let $A$ be the
generalized Laplacian of the grid, $L$ be the lower triangular factor, $B = LL^T$, and $R = B - A$. Eijkhout
[6] shows that the maximum entry of $R$ is bounded above by the value $1 - \sqrt{2}/2$.

To use the embedding techniques to bound the condition number, we first note that the edges in R 
consist of one diagonal across each square face in the graph of A. It is obvious that we can partition A into 
distinct pairs of edges that form paths of length 2 between the endpoints of the edges in R . 

We start with the upper bound on $\lambda_{\max}((A + R)^{-1}A)$. This involves first embedding $-R^-$ into
$A$. By the observation in the preceding paragraph, each edge in $R^-$ is supported by a unique path of
length 2 in $A$. The congestion on any edge in $A$ is the weight of the edge in $-R^-$ that it has been
assigned to support. Thus, the path resistance of any path in $A$ supporting an edge $e$ in $-R^-$ is twice
the weight of $e$. Recalling that the edge weights in $-R^-$ are half the corresponding values in $R$, we see
that the $\alpha$ needed for $A$ to support $R$ is the maximum weight entry from $R$ (we denote it by
$\max(R)$). The upper bound on $\lambda_{\max}((A + R)^{-1}A)$ is $\tau = 1/(1 - \alpha) = 1/(1 - \max(R))$;
for the grid in natural order, this value is at most $\sqrt{2}$.

Now consider the upper bound on $\lambda_{\max}(A^{-1}(A + R))$. A bit of terminology will help make the
description of the embedding simpler: consider the grid laid out on a piece of paper; it has a left side and
a right side. The edges of $R^+$ are diagonals of the squares in the grid, so each has a left end and a right
end. Given our ordering, each vertex in the grid is the right end of at most one edge of $R^+$ and the left
end of at most one.

Recall that the embedding uses two copies of the grid connected via the node representing the zero 
boundary; the edges of $R^+$ run between these copies. We use the following embedding: for each edge in $R^+$,
we route a path from the left-end vertex in copy 1 to the boundary vertex through the closest perimeter 
vertex on the grid. The path is routed back to the right-end vertex in copy 2 via the perimeter vertex closest 
to the right end. 

Since each grid vertex along the path in copy 1 is the left end of at most one edge in $R^+$, each contributes
at most one unit to the congestion along the path. The longest path (which gives the most congestion) runs
from vertices at the center of the grid. Their distance from the perimeter is no more than $n/2$, so by the
observation above, the number of units of congestion summed along the path to the perimeter is at most
$\sum_{i=1}^{n/2} i$. By symmetry, the sum along the path back to the right end has the same bound. The edges
between the perimeter and the vertex representing the zero boundary each have congestion of at most $n/2$
units. Since the edges in $R^+$ have less than unit weight, the congestion must be scaled; multiplying the
total units of congestion by the maximum element of $R^+$ gives an upper bound. Recalling that the
off-diagonals in $R^+$ are half the size of those in $R$, we have $\max(R^+) = \max(R)/2$. Thus we can bound
the path congestion above by the following:

    $\frac{\max(R)}{2} \left( 2 \sum_{i=1}^{n/2} i + 2 \cdot \frac{n}{2} \right) = \max(R) \left( \frac{n^2}{8} + \frac{3n}{4} \right).$


Recall that the bound on the desired eigenvalue is actually larger by 1, giving the following:

    $\lambda_{\max}(A^{-1}(A + R)) \le \max(R) \left( \frac{n^2}{8} + \frac{3n}{4} \right) + 1.$

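Combining this with our previous result and the bound on the elements of $R$ gives

    $\kappa((A + R)^{-1}A) \le \left[ \max(R) \left( \frac{n^2}{8} + \frac{3n}{4} \right) + 1 \right] / (1 - \max(R))$
    $\le (\sqrt{2} - 1) \left( \frac{n^2}{8} + \frac{3n}{4} \right) + \sqrt{2}.$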
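The bounds of this section can be compared with computed eigenvalues on a small grid. The sketch below assumes an $n \times n$ grid with the standard 5-point stencil (diagonal 4 everywhere), a dense incomplete Cholesky restricted to the nonzero pattern of $A$, and $n = 6$; the helper names are hypothetical and the code is illustrative rather than the author's experiment.

```python
import numpy as np

def grid_laplacian(n):
    """5-point Laplacian of an n-by-n grid in natural order (diagonal 4)."""
    T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return np.kron(np.eye(n), T) + np.kron(T, np.eye(n))

def ic0(A):
    """Dense sketch of incomplete Cholesky restricted to A's nonzero pattern."""
    N = A.shape[0]
    L = np.tril(A).astype(float)
    allowed = L != 0.0
    for k in range(N):
        L[k, k] = np.sqrt(L[k, k])
        for i in range(k + 1, N):
            if allowed[i, k]:
                L[i, k] /= L[k, k]
        for j in range(k + 1, N):
            if allowed[j, k]:
                for i in range(j, N):
                    if allowed[i, k] and allowed[i, j]:
                        L[i, j] -= L[i, k] * L[j, k]
    return L

n = 6
A = grid_laplacian(n)
L = ic0(A)
R = L @ L.T - A
m = R.max()                               # max(R)
q = n ** 2 / 8.0 + 3.0 * n / 4.0

lam1 = np.linalg.eigvals(np.linalg.solve(A + R, A)).real.max()   # lambda_max((A+R)^{-1} A)
lam2 = np.linalg.eigvals(np.linalg.solve(A, A + R)).real.max()   # lambda_max(A^{-1}(A+R))
kappa = lam1 * lam2

bound1 = 1.0 / (1.0 - m)
bound2 = m * q + 1.0
bound_kappa = (np.sqrt(2.0) - 1.0) * q + np.sqrt(2.0)
print(lam1, bound1, lam2, bound2, kappa, bound_kappa)
```

Printing the six values side by side shows how much slack each embedding bound leaves for this grid size.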


6.2. Modified Incomplete Factorization. Embedding techniques can also be used in a qualitative 
way to explain why certain methods behave as they do or to examine strategies for producing preconditioners 
based on incomplete factorization. As an example, we will discuss intuitive ideas supporting the use of 
modified incomplete factorizations. In such factorizations, the preconditioner no longer agrees with the
original matrix at all entries specified in $S$. Instead, agreement is enforced at off-diagonal points, and
diagonal entries are modified so that the row sums of the preconditioner $B$ agree with the row sums of $A$.

We start with an example that illustrates the benefits of enforcing consistent row sums. The
preconditioner involved is not practical because computing $B^{-1}x$ is difficult, but it clearly illustrates
the main points. Let $A$ be a positive definite generalized Laplacian, $L$ be the lower triangular incomplete
factor of $A$, $B = LL^T$, and $R = B - A$. Let $D$ be a diagonal matrix with $d_{ii}$ equal to the sum of
the elements in row $i$ of $R$. Let $B_{mod} = LL^T - D$; that is, $B_{mod}$ is $B$ modified so that its row
sums are the same as those of $A$.

What is $\kappa(B_{mod}^{-1}A)$? Let $R_{mod} = B_{mod} - A$. Because of the change in the diagonal,
$R_{mod}$ is the negative of a Laplacian: $R_{mod}^- = R_{mod}$ and $R_{mod}^+ = 0$. Recall that the upper
bound on $\lambda_{\max}(A^{-1}(A + R_{mod}))$ is $\tau$ such that $(\tau - 1)A - R_{mod}^+$ is positive
semidefinite. Since $R_{mod}^+ = 0$, $\tau = 1$ will do. The upper bound on
$\lambda_{\max}((A + R_{mod})^{-1}A)$ is $\tau = 1/(1 - \alpha)$, where $0 < \alpha < 1$ and
$\alpha A + R_{mod}^-$ is positive semidefinite. The off-diagonal entries in $R_{mod}$ are twice the size of
those in $R^-$, so $\alpha$ is twice as big in the modified case as it is in the unmodified case.

This suggests a number of interesting things. First, the decrease in the bound on
$\lambda_{\max}(A^{-1}(A + R))$ is often substantial because we no longer need to deal with the expanded
versions of $A$ and $R^+$. Often the number of edges in $R^+$ is sufficient to cause large congestion through
the boundary vertex, and the paths from interior vertices can be relatively long. With the modifications we
are considering, these factors disappear: In the case of the naturally ordered square grid with boundary as
above, the decrease is from a value that is $\Theta(n^2)$ to a constant (1).

Second, the bound on $\lambda_{\max}((A + R)^{-1}A)$ increases, though if $\alpha$ is small in the unmodified
case, the increase is also small. For the model square grid problem, $\max(R)$ is less than 1/3. The upper
bound on $\lambda_{\max}((A + R_{mod})^{-1}A)$ therefore at most doubles.
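The two observations above can be checked on a small grid. The sketch below forms $B_{mod} = LL^T - D$ for an assumed $5 \times 5$ grid with the standard 5-point stencil and compares the computed extreme eigenvalues with the bounds $1$ and $1/(1 - 2\max(R))$. The dense incomplete Cholesky helper and all names are hypothetical illustrations.

```python
import numpy as np

def grid_laplacian(n):
    """5-point Laplacian of an n-by-n grid in natural order (diagonal 4)."""
    T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return np.kron(np.eye(n), T) + np.kron(T, np.eye(n))

def ic0(A):
    """Dense sketch of incomplete Cholesky restricted to A's nonzero pattern."""
    N = A.shape[0]
    L = np.tril(A).astype(float)
    allowed = L != 0.0
    for k in range(N):
        L[k, k] = np.sqrt(L[k, k])
        for i in range(k + 1, N):
            if allowed[i, k]:
                L[i, k] /= L[k, k]
        for j in range(k + 1, N):
            if allowed[j, k]:
                for i in range(j, N):
                    if allowed[i, k] and allowed[i, j]:
                        L[i, j] -= L[i, k] * L[j, k]
    return L

A = grid_laplacian(5)
L = ic0(A)
R = L @ L.T - A
m = R.max()

D = np.diag(R @ np.ones(R.shape[0]))      # row sums of R on the diagonal
B_mod = L @ L.T - D                       # row sums now match those of A
R_mod = B_mod - A                         # negative of a Laplacian

lam_fwd = np.linalg.eigvals(np.linalg.solve(B_mod, A)).real.max()   # lambda_max(B_mod^{-1} A)
lam_rev = np.linalg.eigvals(np.linalg.solve(A, B_mod)).real.max()   # lambda_max(A^{-1} B_mod)
print(lam_rev, lam_fwd, 1.0 / (1.0 - 2.0 * m))
```

In this example $\lambda_{\max}(A^{-1}B_{mod})$ does not exceed 1, and $\lambda_{\max}(B_{mod}^{-1}A)$ stays below $1/(1 - 2\max(R))$, in line with the discussion of $\alpha$ doubling.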

These observations suggest that this (impractical) modified preconditioner can give large improvements 
in the spectral condition number over the unmodified version. (We note again that these are bounds on the 
spectral condition number, and that the spectral condition number only lets us find an upper bound on the 
convergence rate. However, experiments show that bounds on $\kappa$ are good, and the connection to convergence
rates is good for this model problem.) 
The problem with this preconditioner is that it is hard to work with. In particular, we do not know of an
easy way to compute $B_{mod}^{-1}x$, nor of a way to compute a close approximation of $B_{mod}$ that is easy
to work with. However, there are modified incomplete factorization methods that do produce a preconditioner
with row sums equal to those of $A$. The discussion above provides some intuition about why these modified
incomplete Cholesky preconditioners do not give spectral condition numbers as small as
$\kappa(B_{mod}^{-1}A)$. In particular, when the modified factorization algorithm encounters a fill entry
$(i, j)$ that is to be dropped (i.e., $[S]_{ij} = 0$),
it reduces the diagonal entries for i and j. Because the columns in L are divided by the square roots of 
their diagonal entries, the entries in columns i and j are increased relative to their values in the unmodified 
factor. This increases the size of any fill entries produced by the entries in these columns; if any such fill 
entry is dropped, its corresponding entry in the error matrix of the preconditioner is likewise increased over 
the unmodified case. The dropped fill entries also cause decreases in subsequently ordered diagonal entries, 
allowing the effects to ripple through L. As a result, entries in the error matrix can become substantially 
larger than in the unmodified case. 

In the model grid problem, the largest entries in the error matrix for modified incomplete Cholesky
factorization approach 1/2, and the bound on $\lambda_{\max}((A + R_{mod})^{-1}A)$ increases by more than a
constant factor. Experiments suggest that the number of iterations needed for convergence increases as well.